CASIANED: People Attribute Extraction based on Information Extraction

نویسندگان

  • Xianpei Han
  • Jun Zhao
چکیده

In this paper, we describe the people attribute extraction system of the CASIANED team for the second Web People search evaluation (WePS-2). We develop an attribute extraction system based on information extraction. Firstly the attribute candidates for every attribute class are extracted using several different information extraction techniques; then these candidates are verified through classification. The system achieves F-measure 0.309 on the develop dataset and F-measure 0.117 on the final test dataset.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Which Who are They? People Attribute Extraction and Disambiguation in Web Search Results∗

People name search often returns a lot of Web pages containing the strings of personal names. Due to namesake, extracting target person attributes (such as birthday, occupation, affiliation, nationality, contact information, etc.) is expected to be helpful to differentiate documents related to different people and thus group documents related to the same person. This paper presents the methodol...

متن کامل

Open Knowledge Extraction through Compositional Language Processing

We present results for a system designed to perform Open Knowledge Extraction, based on a tradition of compositional language processing, as applied to a large collection of text derived from the Web. Evaluation through manual assessment shows that well-formed propositions of reasonable quality, representing general world knowledge, given in a logical form potentially usable for inference, may ...

متن کامل

Multi-level Alignment for Attribute Extraction in IEPAD

The problem of information extraction (IE) regards automatic generation of extraction programs (also called wrappers). Similar to compiler generator, the core problem is to generate extraction rules. In this paper, we introduce IEPAD (an acronym for Information Extraction based on PAttern Discovery), a system that generalizes extraction patterns from Web pages without user-labeled examples. The...

متن کامل

A review on EEG based brain computer interface systems feature extraction methods

The brain – computer interface (BCI) provides a communicational channel between human and machine. Most of these systems are based on brain activities. Brain Computer-Interfacing is a methodology that provides a way for communication with the outside environment using the brain thoughts. The success of this methodology depends on the selection of methods to process the brain signals in each pha...

متن کامل

A review on EEG based brain computer interface systems feature extraction methods

The brain – computer interface (BCI) provides a communicational channel between human and machine. Most of these systems are based on brain activities. Brain Computer-Interfacing is a methodology that provides a way for communication with the outside environment using the brain thoughts. The success of this methodology depends on the selection of methods to process the brain signals in each pha...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009